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Abstract 

We discuss the "soft-ratcheting" algorithm which generates tar- 
geted stochastic trajectories in molecular systems with scores corre- 
sponding to their probabilities. The procedure, which requires no ini- 
tial pathway guess, is capable of rapidly determining multiple path- 
ways between known states. Monotonic progress toward the target 
state is not required. The soft-ratcheting algorithm is applied to 
an all-atom model of alanine dipeptide, whose unbiased trajectories 
are assumed to follow overdamped Langevin dynamics. All possi- 
ble pathways on the two-dimensional dihedral surface are determined. 
The associated probability scores, though not optimally distributed at 
present, may provide a mechanism for estimating reaction rates. 



* Current address: Center for Computational Biology & Bioinformatics, University of 
Pittsburgh, Pittsburgh, PA 15213, dzuckerman@ceoh.pitt.edu 



1 



1 Introduction 



Reaction paths in large molecular systems, such as biomolecules, provide crit- 
ical information regarding structural intermediates (transitions states) and 
barrier heights. The search for these paths has a long history in the applied 
math research commnity (e.g., as well as in the field of biomolecular 
computation [0, |, § |, |, 0, |, |, |10|, [ll|, Q, |13|, 0, ^ 0. Many 
approaches must start from an initial guess for the reaction path (such as 
a straight line between two states), effectively limiting the search to a sin- 
gle pathway. On the other hand, "targeted" and "steered" MD approaches 
1^, 1^, |TD[ are capable of finding multiple pathways by repeated simulation 
(from differing initial conditions) forced to reach the desired end state. 

The recently-introduced soft-ratcheting approach |TB| is also capable of 
"blindly" determining multiple reaction pathways. It differs from the tar- 
geted and steered approaches in the following ways: (i) monotonic progress 
toward the target state is not enforced, permitting a wider range of reaction 
pathways; (ii) soft-ratcheting is applied in the context of stochastic dynam- 
ics, although this does not prevent the inclusion of explicit solvent molecules; 
and (iii) a probability weight ("score") is associated with each trajectory gen- 
erated, which in principle also permits estmates of the reaction rates within 
the dynamic importance sampling (DIMS) formulation discussed by Woolf 



[jT9[ and by Zuckerman and Woolf ||20| , |2l[| . We note that reaction-rate esti- 
mates have not yet been produced by the soft-ratcheting algorithm, because 
such estimates require trajectories sampled with a near-optimal distribution 
(i.e., as would occur in unbiased dynamics; see below). 

The soft-ratcheting procedure is simple and is motivated by the Metropo- 
lis algorithm |^| and the "exponential transformation" used in nuclear im- 
portance sampling (e.g., p3[). Related methods include the "weighted- 
ensemble Brownian dynamics" approach of Huber and Kim and the 
"CONTRA MD" scheme of Harvey and Gabb 0. The process is: (a) gen- 
erate an unbiased step; (b) if the step moves toward the target state, accept 
it; (c) if it moves away, accept it with a probability (i.e., "softly") that in- 
creases in the forward direction; (d) repeat, while estimating the probability 
score for all accepted steps. We emphasize that the non-monotonicity em- 
bodied in (b) and (c), and the existence of the score in (d) distinguish this 
method from previous multiple-pathway methods. The guiding concept be- 
hind soft-ratcheting is not to force the system (which necessarily perturbs 
the dynamics) but to try to allow the system to proceed in a possible, if 
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unlikely, way. Of course, rare stochastic events are just what we seek! 

Note too that, unlike the trajectory sampling methods introduced by 
Pratt 
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T7| , the overall 



and pursued by Chandler and coworkers 
well as those of Eastman, Gr0nbech- Jensen and Doniach 
effect of the soft-ratcheting algorithm is non-Metropolis in nature (despite 
the motivation): trajectories do not evolve from one another and are statis- 
tically independent. The Metropohs idea is only used to ensure that a given 
trajectory successfully reaches the target state. In this important sense, the 
soft-ratcheting algorithm comes under the independent-trajectory rubric of 
the DIMS method [|T9|, EOl, M 



2 Theory 

2.1 Generating Paths 

In essence there is no more theoretical underpinning to producing soft-ratcheted 
trajectories than that already sketched in the Introduction: using a physically- 
but-arbitrarily chosen acceptance probability function for step increments, one 
accepts all forward steps, and backward steps are accepted with a probability 
which decreases the more "negative" the increment. See Fig. |} Here, the 
forward direction is simply some vector in configuration space that points 
from the initial to the target state — perhaps a dihedral angle in a dihedral 
transition. The algorithm is sufficiently robust (see Results section) that ad- 
vance knowledge of the reaction path and the true reaction coordinate is not 
necessary. 

When generating a series of soft-ratcheted crossing events in a single long 
trajectory, it is convenient to use a simple threshold device This means 



only that trajectories are permitted to perform unbiased dynamics in small 
regions near the "centers" of the beginning and end states, and biased (i.e., 
soft-ratcheted) dynamics begin only when the threshold is reached. The idea 
is to allow the trajectory to explore different parts of the stable states, with 
an eye toward finding exit points to different pathways. Such exploration, 
of course, must take place within the limits of available computer time! As 
noted below, our use of the threshold requires further investigation and op- 
timization, though it appears to perform the task of permitting exploration 
of alternative exit points from a stable state. 
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2.2 Scoring Paths 

It is only when one wishes to associate a score with a trajectory that some 
analysis must be undertaken. The dynamic importance sampling (DIMS) 
approach requires the probability score for use in rate estimates, moreover. 
Specifically, the probability score used in DIMS calculations is the ratio of 
two quantities [^, ^ (i) the probability that the given trajectory would 
occur in an unbiased simulation — a known quantity; and (ii) the probabilty 
that the given trajectory was produced in the biased (e.g., soft-ratcheting) 
simulation. Further details of the full DIMS formulation may be found in 
Refs. [rU, ^ and are beyond the scope of the present report. Here we 



focus solely on computing the probability that the soft-ratcheting algorithm 
produced a given trajectory (ii), which unfortunately does not follow directly 
from the simple acceptance probability used to generate the trajectory. 

This section gives full details of generating the probability score (i.e., 
ratio) required by DIMS. Briefiy, however, assume progress towards the tar- 
get is measured in terms of a scalar "distance," which is larger at the 
target state than the initial: each step corresponds to an increment A(y9, 
with positive increments moving toward the target. From any starting con- 
figuration x„_i, one can define the unbiased distribution of A(f increments 
p^^{Aip;yin\^n-i), which is simply the projection of the more fundamental 
distribution of configurations, x„, onto the A(p coordinate. The distribution 
of Aip increments typically is nearly Gaussian with a mean which may be ei- 
ther positive or negative. However, once certain backward steps are rejected 
due to the acceptance function in the soft-ratcheting procedure (specified 
below), the Aip distribution is shifted forward in a non-trivial way to become 
the biased distribution, b^^{Aip;yin\^n-i)- Estimating the ratio of values of 
these two distributions for every accepted step (though not the entire distri- 
butions) is the task at hand. 

The multi-dimensional case reduces to a simple scalar description in terms 
of A(p increments, but we include it for completeness. We assume (although it 
is not necessary for the formalism) that the initial and final states of interest 
in our molecule do not require the full all-atom configuration x, but rather a 
subset of coordinates, say, {0i, 02, • • •}• If the target point is the "center" of 
state B, say, {</>f }, then one can always measure the distance to that point. 



E 



1 1/2 



(1) 



4 



where it may be necessary to consider the closest distance if the 0^ coordinates 
represent true angles. For a step from x„_i to x„, one can then define a one- 
dimensional change in distance by 

A^(n-1 ^ n) = dMt^}) - ds{{ct^t'^}) • (2) 

In essence, since distance from the target is always a scalar quantity, one need 
only consider a one-dimensional description to estimate probability scores. 




Figure 1: An example acceptance function for use in the soft-ratcheting 
algorithm. The acceptance probabihty pacc for a given step is plotted against 
the "distance" toward the target in angle space. Ay?. Steps toward the target 
(A(/? > 0) are always accepted, while steps away from the target (Ay? < 0) 
are accepted with probability less than one. Thus, trajectories are not forced 
toward the target, but "softly" ratcheted. 

The acceptance function for A(/9 increments is very simple and is specified 
by the simulator. The function used in the present work is illustrated in Fig. 
|T] and is written 

(A \- \ ^ if A(/.>0 

PacclAv^j - < [_| A^/A<^oP] if A<^ < , ^"^^ 
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where Ay^o is a parameter which controls the width of the (backwards) decay 
depicted in Fig. |l]. The gradual decay to zero is the "softness" of soft- 
ratcheting: many backwards steps will be accepted. 

With Pace specified, the final task toward generating the required probabil- 
ity score is to consider the relation between the unbiased and soft-ratcheted 
(biased) distribution. The probability (density) that the soft-ratcheting algo- 
rithm will generate a given (p increment, 6ai^; is proportional to the product 
of the unbiased probability of generating the increment, pa(/p, and the accep- 
tance probability, pacc: 

6av(A(/7) = 7Vr-VA^(Av?) Pacc(Av?) , (4) 

where < 1 is the required normalization factor, given by the fraction 
of steps initiated at x„_i which would be accepted by the soft-ratcheting 
procedure. As noted, the biased distribution, h/^^, has been shifted forward 
in the tp direction because the acceptance function pacc partially suppresses 
backward steps. 

The desired probability score for a single step is then the ratio deriving 
from (^, namely, 

single-step ratio = ^^'^{^'^^ = — . (5) 

6a<^(Av2) Pacc(A(y9) 

To truly calculate the normalization factor A/", one would have to initiate 
a large number of steps from the point x„_i and compute the fraction ac- 
cepted by the soft-ratcheting acceptance function. As that procedure would 
be very computationally expensive, we instead use the sequence of nearby at- 
tempted steps, both accepted and rejected, to estimate the probability that 
soft-ratcheted steps were accepted in a given local region of configuration 
space. The final score is simply the product of the single-step scores (^. 



3 Results 

The results of this preliminary report may be summarized in three points: 
(i) the soft-ratcheting algorithm is capable of generating reaction pathways 
rapidly — in a fraction of the time which would be required by unbiased 
simulation: see Fig. ^ (ii) the scores associated with each crossing trajectory 
permit the generation of a most-important ensemble of events as in Fig. 
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^ which can give more detailed information about the full "valley" of the 
pathway; and (iii) the associated scores, in principle, permit rate estimates 
within the dynamic importance sampling formulation |]T9|, ^ 
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Figure 2: Rapid generation of crossing trajectories with the soft-ratcheting 
algorithm. The figure shows both crossing trajectories generated by unbiased 
simulation (dark lines) and those generated by the soft-ratcheting algorithm 
in a fraction of the uniased simulation time (dashed lines, "DIMS"). The 
potential is AMBER94 p5[ as encoded in the Molecular Modelling Tool Kit 



26| for an all-atom representation of alanine dipeptide, and the unbiased 



trajectories were generated using overdamped Langevin dynamics. 



In Figure 0, one sees the rapidity with which the soft-ratcheting algorithm 
generates crossing trajectories. The same three pathways are found in l/70th 
of the simulation time. In absolute terms, the 10 nsec. of simulation time used 
in generating the soft-ratcheting trajectories appears quite long; however, this 
time may be significantly reduced by adjusting the threshold level (see Sec. 
ID from the preliminary value used to generate the depicted results. 

Figure ^ illustrates the capacity of the soft-ratcheting algorithm to gener- 
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Figure 3: Accuracy of top-scoring crossing trajectories generated with the 
soft-ratcheting algorithm. The figure shows both crossing trajectories gen- 
erated by unbiased simulation (dark lines) and the top-scoring trajectories 
generated by the soft-ratcheting (dashed lines, "DIMS"). Note that the large 
ensemble of soft-ratcheted trajectories appear to better explore the full path- 
way "valleys." Data are from simulations of equal length, with the same 
potential and dynamics as in Fig. |[ 

ate an "important" (highly weighted) ensemble of crossing events. The large 
set of trajectories shown in the figure clearly gives a better description of the 
pathway valleys than the sparse events generated by unbiased simulation. 

Figure ^ also demonstrates the agreement between the weight estimate 
discussed previously (used to select the depicted trajectories) and the un- 
biased results. The higher-weighted trajectories coincide strongly with the 
unbiased events. The large cluster of soft-ratcheted trajectories in the region 
110 < < 180 deserves comment. Because there is only a single unbiased 
event in that region, it is not obvious whether the relatively widely dispersed 
soft-ratcheted trajectories are "correct" — i.e., whether such an ensemble of 
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trajectories would be found in a long unbiased simulation, with many events 
in the region. Examination of the adiabatic energy surface (not shown) does 
indicate that the channel in question is indeed significantly wider than the 
two pathways crossing = 0, though perhaps not quite to the extent sug- 
gested by the soft-ratcheting trajectories of Fig. ||. 



4 Future Research 

Several means of improving the soft-ratcheting procedure are possible, of 
which we mention two. First, to increase the speed with which transition 
trajectories are generated — really, to decrease the waiting interval between 
crossing events — one can reduce the size of the threshold region (Sec. ^ 
in which purely unbiased dynamics are performed. The threshold region 
was intended to permit trajectories to explore a multiplicity of potential 
"exit points" from the stable state. However, the "softness" of the soft- 
ratcheting algorithm should, by itself, permit a substantial degree of this 
kind of exploration, and it may be possible to use a very small threshold 
region. 

Second, a more optimal (i.e., higher-scoring) ensemble of trajectories pre- 
sumably can be obtained by systematic estimation of parameter Acpo. In 
fact, the promising preliminary results presented in Sec. ^ were based on an 
ad-hoc choice. It is a simple matter to study in more detail an unbiased dis- 
tribution of A(f increments, and then use this data to systematically inform 
the choice of A(fo- Moreover, one can imagine attempting to bias trajecto- 



ries forward in a focussed conical region of dihedral angles |]23|, rather than 
simply according to (hyper)planes of constant A(p. 

Ultimately, it will also be important to compare the soft-ratcheted paths 
(which presumably represent the stochastic dynamics in a faithul way) with 
those generated by explicitly-solvated molecular dyanmics simulation. That 
is, how does the addition of explicit solvent alter the paths? Of course, 
this comparison will only be possible in small molecules like the alanine 
dipeptide and other small peptides, but it will provide a crucial validation of 
the technique. 
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5 Summary and Discussion 



We have given motivation and details for the "soft-ratcheting" algorithm 



18| for determining reaction pathways in molecular systems governed by 
stochastic dynamics. The method generates independent transition trajec- 
tories which will not be trapped in a single channel (pathway), and hence 
is capable of finding multiple channels. Although a final state is always tar- 
geted on average, the algorithm permits "backward" steps with a suppressed 
probability. The trajectories are thus ratcheted forward, but only softly: see 
Fig. m. The capacities of the approach were demonstrated in Figs. ^ and 
for an all-atom model of the alanine dipeptide molecule evolving according 
to overdamped Langevin dynamics with the AMBER potential ||25|| . 

Beyond rapidly generating multiple pathways, as other existing approaches 
are presently able to do p|, ^ |10|, the soft-ratcheting algorithm has the poten- 
tial also to estimate reaction rates and free energy differences via the dynamic 



importance sampling (DIMS) framework [^, ^ . The soft-ratcheting al- 
gorithm associates a score (see Sec. ^ with each transition trajectory it 
generates. The scores, in turn, may be used in principle to estimate kinet- 
ics and free energy differences. At present, however, we note that initial 
results showed that further parameterization and/or refinement of the algo- 
rithm are necessary before efficiency can be obtained in rate and free energy 
calculations. 
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